Translated by Google Translate
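Recently, panoramic semantic segmentation methods based on horizontal representations have outperformed projection-based solutions, because distortion can be removed effectively by compressing the spherical data in the vertical direction. However, these methods ignore the prior on the distortion distribution and are limited by unbalanced receptive fields: the receptive field is sufficient in the vertical direction but insufficient in the horizontal direction. In contrast, a vertical representation, compressed along the other direction, can provide an implicit distortion prior and enlarge the horizontal receptive field. In this paper, we combine the two representations and propose a novel 360° semantic segmentation solution from a complementary perspective. Our network comprises three modules: a feature extraction module, a bi-directional compression module, and an ensemble decoding module. First, we extract multi-scale features from the panorama. Then, a bi-directional compression module compresses the features into two complementary low-dimensional representations, which provide content perception and a distortion prior. Furthermore, to facilitate the fusion of the bi-directional features, we design a unique self-distillation strategy in the ensemble decoding module to strengthen the interaction between the different features and further improve performance. Experimental results show that our method outperforms state-of-the-art solutions, with at least a 10% improvement in quantitative evaluations, while also showing the best visual quality.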
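The bi-directional compression idea can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration: the abstract does not say which operator performs the compression, so plain mean pooling stands in for it, and the names `bidirectional_compress` and `fuse` are hypothetical.

```python
import numpy as np

def bidirectional_compress(feat):
    """Compress a (C, H, W) feature map along each spatial axis.

    Mean pooling is an assumption; the abstract does not specify the
    compression operator.
    """
    horizontal = feat.mean(axis=1)  # collapse height -> shape (C, W)
    vertical = feat.mean(axis=2)    # collapse width  -> shape (C, H)
    return horizontal, vertical

def fuse(horizontal, vertical):
    """Broadcast both 1-D representations back to (C, H, W) and add them."""
    return horizontal[:, None, :] + vertical[:, :, None]

# Toy feature map: 8 channels on a 16 x 32 panorama grid.
feat = np.random.default_rng(0).normal(size=(8, 16, 32))
h_rep, v_rep = bidirectional_compress(feat)
fused = fuse(h_rep, v_rep)
```

The horizontal representation keeps one value per image column and the vertical one keeps one value per image row, which is why the two are complementary.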
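Homography estimation is an important task in computer vision applications such as image stitching, video stabilization, and camera calibration. Traditional homography estimation methods depend heavily on the number and distribution of feature correspondences, which leads to poor robustness in low-texture scenes. Learning-based solutions, in contrast, try to learn robust deep features but perform unsatisfactorily in scenes with low overlap rates. In this paper, we address both problems simultaneously by designing a contextual correlation layer (CCL). The CCL efficiently captures long-range correlations within feature maps and can be used flexibly in a learning framework. In addition, considering that a single homography cannot represent the complex spatial transformation in images with parallax, we propose to predict multi-grid homography from global to local. Moreover, by introducing a novel depth-aware shape-preserved loss, we equip our network with depth perception capability. Extensive experiments demonstrate the superiority of our method over state-of-the-art solutions on both a synthetic benchmark dataset and a real-world dataset. The code and models will be available at https://github.com/nie-lang/multi-grid-deep-homography.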
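To make the difference between a single homography and a multi-grid homography concrete, here is a small NumPy sketch. It is not the paper's network: the regular grid layout, the `(rows, cols, 3, 3)` array of per-cell matrices, and the function names are all assumptions made for illustration.

```python
import numpy as np

def apply_homography(H, pts):
    """Map (N, 2) points through a 3x3 homography via homogeneous coordinates."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = pts_h @ H.T
    return mapped[:, :2] / mapped[:, 2:3]

def warp_multigrid(Hs, pts, width, height):
    """Warp each point with the homography of the grid cell containing it.

    Hs has shape (rows, cols, 3, 3); a regular grid is an assumption.
    """
    rows, cols = Hs.shape[:2]
    col = np.clip((pts[:, 0] / width * cols).astype(int), 0, cols - 1)
    row = np.clip((pts[:, 1] / height * rows).astype(int), 0, rows - 1)
    out = np.empty_like(pts, dtype=float)
    for i, p in enumerate(pts):
        out[i] = apply_homography(Hs[row[i], col[i]], p[None, :])[0]
    return out

# 2 x 2 grid of identity homographies; the top-right cell shifts x by 5 pixels.
Hs = np.tile(np.eye(3), (2, 2, 1, 1))
Hs[0, 1, 0, 2] = 5.0
pts = np.array([[10.0, 10.0], [90.0, 10.0]])
warped = warp_multigrid(Hs, pts, width=100, height=100)
```

With one global matrix both points would move identically; with per-cell matrices only the point in the top-right cell is shifted, which is the kind of local freedom parallax requires.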
This paper revisits a fundamental problem in statistical inference, the construction of confidence sets, from a non-asymptotic theoretical viewpoint. We establish a finite-sample bound for the estimator, characterizing its asymptotic behavior in a non-asymptotic fashion. An important feature of our bound is that its dimension dependency is captured by the effective dimension (the trace of the limiting sandwich covariance), which can be much smaller than the parameter dimension in some regimes. We then illustrate how the bound can be used to obtain a confidence set whose shape is adapted to the optimization landscape induced by the loss function. Unlike previous works that rely heavily on the strong convexity of the loss function, we only assume that the Hessian is lower bounded at the optimum and allow it to gradually become degenerate. This property is formalized by the notion of generalized self-concordance, which originated in convex optimization. Moreover, we demonstrate how the effective dimension can be estimated from data and characterize its estimation accuracy. We apply our results to maximum likelihood estimation with generalized linear models, score matching with exponential families, and hypothesis testing with Rao's score test.
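As a rough illustration of estimating such a quantity from data, the sketch below forms a plug-in sandwich matrix from per-sample gradients and a Hessian and takes its trace. The function name is hypothetical, and the exact normalization used in the paper is not stated in the abstract, so the form H^{-1} G H^{-1} is an assumption.

```python
import numpy as np

def sandwich_trace(grads, hessian):
    """Trace of the plug-in sandwich matrix H^{-1} G H^{-1}.

    grads:   (n, d) per-sample gradients evaluated at the estimate
    hessian: (d, d) symmetric, invertible average Hessian at the estimate
    The normalization is an assumption; the abstract does not spell it out.
    """
    G = grads.T @ grads / len(grads)         # empirical gradient covariance
    H_inv_G = np.linalg.solve(hessian, G)    # H^{-1} G
    S = np.linalg.solve(hessian, H_inv_G.T)  # H^{-1} G H^{-1} for symmetric H, G
    return float(np.trace(S))

# Toy inputs: four per-sample gradients in two dimensions, Hessian 2 * I.
grads = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 2.0], [0.0, -2.0]])
value = sandwich_trace(grads, 2.0 * np.eye(2))
```

Using `np.linalg.solve` instead of an explicit inverse is a small design choice that tends to be more stable when the Hessian is close to degenerate, the regime the abstract is concerned with.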
Generative AI has matured to a point where large-scale models can generate text that seems indistinguishable from human-written text and remarkably photorealistic images. Automatically measuring how close the distribution of generated data is to the target real data distribution is a key step in diagnosing existing models and developing better models. We present MAUVE, a family of comparison measures between pairs of distributions such as those encountered in the generative modeling of text or images. These scores are statistical summaries of divergence frontiers capturing two types of errors in generative modeling. We explore four approaches to statistically estimate these scores: vector quantization, non-parametric estimation, classifier-based estimation, and parametric Gaussian approximations. We provide statistical bounds for the vector quantization approach. Empirically, we find that the proposed scores paired with a range of $f$-divergences and statistical estimation methods can quantify the gaps between the distributions of human-written text and those of modern neural language models by correlating with human judgments and identifying known properties of the generated texts. We conclude the paper by demonstrating its applications to other AI domains and discussing practical recommendations.
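A divergence frontier summary of this kind can be sketched for the vector quantization route, assuming both distributions have already been quantized into histograms over the same bins. The scaling constant `c`, the grid of mixture weights, the KL divergence as the choice of $f$-divergence, and the function names are assumptions for illustration rather than the paper's reference implementation.

```python
import numpy as np

def kl(p, q):
    """KL(p || q) for discrete distributions; q must be positive where p is."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def mauve_from_histograms(p, q, c=5.0, num_points=25):
    """Area under a KL divergence frontier between two histograms.

    c and num_points are illustrative defaults, not values from the abstract.
    """
    lambdas = np.linspace(1e-6, 1 - 1e-6, num_points)
    xs, ys = [1.0], [0.0]                  # frontier endpoint (1, 0)
    for lam in lambdas:
        r = lam * p + (1 - lam) * q        # mixture of the two distributions
        xs.append(np.exp(-c * kl(q, r)))   # one type of error
        ys.append(np.exp(-c * kl(p, r)))   # the other type of error
    xs.append(0.0)
    ys.append(1.0)                         # frontier endpoint (0, 1)
    x = np.array(xs[::-1])                 # reversed so x is non-decreasing
    y = np.array(ys[::-1])
    return float(np.sum(np.diff(x) * (y[1:] + y[:-1]) / 2))  # trapezoid rule

same = mauve_from_histograms(np.array([0.5, 0.5]), np.array([0.5, 0.5]))
disjoint = mauve_from_histograms(np.array([1.0, 0.0]), np.array([0.0, 1.0]))
```

Identical histograms give a score of 1, and histograms with disjoint support give a score close to 0, which matches the intent of a measure of closeness between distributions.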
Kernels are efficient in representing nonlocal dependence and they are widely used to design operators between function spaces. Thus, learning kernels in operators from data is an inverse problem of general interest. Due to the nonlocal dependence, the inverse problem can be severely ill-posed with a data-dependent singular inversion operator. The Bayesian approach overcomes the ill-posedness through a non-degenerate prior. However, a fixed non-degenerate prior leads to a divergent posterior mean when the observation noise becomes small, if the data induces a perturbation in the eigenspace of zero eigenvalues of the inversion operator. We introduce a data-adaptive prior to achieve a stable posterior whose mean always has a small noise limit. The covariance of the data-adaptive prior is the inversion operator scaled by a hyper-parameter that is selected adaptively from the data by the L-curve method. Furthermore, we provide a detailed analysis of the computational practice of the data-adaptive prior, and demonstrate it on Toeplitz matrices and integral operators. Numerical tests show that a fixed prior can lead to a divergent posterior mean in the presence of any of the four types of errors: discretization error, model error, partial observation, and wrong noise assumption. In contrast, the data-adaptive prior always attains posterior means with small noise limits.
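The divergence described above can be reproduced in a two-dimensional toy problem. This is a simplified sketch under stated assumptions: the inversion operator is a singular diagonal matrix `A`, the data vector `b` carries a small perturbation in the null space of `A`, the fixed prior has identity covariance, and the hyper-parameter `lam` is fixed by hand rather than selected by the L-curve method. The posterior mean formulas are the standard Gaussian ones for this toy setup, not formulas quoted from the paper.

```python
import numpy as np

A = np.diag([1.0, 0.0])       # singular inversion operator (toy assumption)
b = np.array([1.0, 1e-2])     # second entry: perturbation in the null space of A

def mean_fixed_prior(A, b, sigma):
    """Posterior mean with a fixed identity-covariance prior."""
    return np.linalg.solve(A + sigma**2 * np.eye(len(b)), b)

def mean_adaptive_prior(A, b, sigma, lam=1.0):
    """Posterior mean with prior covariance A / lam (lam fixed by hand here)."""
    return np.linalg.solve(A @ A + lam * sigma**2 * np.eye(len(b)), A @ b)

for sigma in (1e-1, 1e-2, 1e-3):
    fixed = np.linalg.norm(mean_fixed_prior(A, b, sigma))
    adaptive = np.linalg.norm(mean_adaptive_prior(A, b, sigma))
    print(f"sigma={sigma:g}  fixed={fixed:.3g}  adaptive={adaptive:.3g}")
```

In this toy case the null-space component of the fixed-prior mean scales like the perturbation divided by sigma squared, so it blows up as the noise shrinks, while the adaptive-prior mean stays bounded because the prior puts no mass on the null space.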
To facilitate research on text generation, this paper presents a comprehensive and unified library, TextBox 2.0, focusing on the use of pre-trained language models (PLMs). To be comprehensive, our library covers $13$ common text generation tasks and their corresponding $83$ datasets and further incorporates $45$ PLMs covering general, translation, Chinese, dialogue, controllable, distilled, prompting, and lightweight PLMs. We also implement $4$ efficient training strategies and provide $4$ generation objectives for pre-training new PLMs from scratch. To be unified, we design the interfaces to support the entire research pipeline (from data loading to training and evaluation), ensuring that each step can be fulfilled in a unified way. Despite the rich functionality, it is easy to use our library, either through the friendly Python API or command line. To validate the effectiveness of our library, we conduct extensive experiments and exemplify four types of research scenarios. The project is released at the link: https://github.com/RUCAIBox/TextBox.